Comparing and Extracting Paraphrasing Words with 2-Way Bilingual Dictionaries
Authors
Abstract
We analyze the variety of lexical expressions using 2-way bilingual dictionaries and propose a method for extracting paraphrasing words. First, we compare the coverage of an English-Japanese (E-J) dictionary and a Japanese-English (J-E) dictionary in terms of returnability, that is, whether a word is recovered when it is translated from English to Japanese and then back to English. We illustrate the variety with examples. Next, we propose a method for automatically extracting English paraphrasing word groups: we gather the English index words that have the same Japanese translation words in the E-J dictionary, and then extract the English words that are difficult for native speakers of Japanese to distinguish into a paraphrasing group. We also extract Japanese paraphrasing word groups for comparison. The method should be useful for sentence matching, particularly for accommodating variation in expression.
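The two steps described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the toy dictionaries `EJ` and `JE`, their entries, and the function names are all invented here, and "the same Japanese translation words" is assumed to mean an identical set of translations. The further filtering for words that are hard for native speakers of Japanese to distinguish is not modeled.

```python
from collections import defaultdict

# Toy dictionaries, invented for illustration only (not from the paper).
EJ = {  # English index word -> Japanese translation words
    "big": {"大きい"},
    "large": {"大きい"},
    "huge": {"巨大な"},
    "quick": {"速い"},
    "fast": {"速い"},
}
JE = {  # Japanese index word -> English translation words
    "大きい": {"big", "large"},
    "巨大な": {"enormous"},
    "速い": {"fast"},
}


def round_trip(word, ej, je):
    """Return the English words reached by translating E->J, then J->E."""
    back = set()
    for ja in ej.get(word, ()):
        back |= je.get(ja, set())
    return back


def is_returnable(word, ej, je):
    """A word is 'returnable' here if the round trip yields the word itself."""
    return word in round_trip(word, ej, je)


def paraphrase_groups(ej):
    """Group English index words whose Japanese translation sets are identical
    (an assumption about what 'the same translation words' means)."""
    by_translation = defaultdict(set)
    for en, ja_words in ej.items():
        by_translation[frozenset(ja_words)].add(en)
    # Keep only groups with at least two members, in a stable order.
    return sorted(sorted(g) for g in by_translation.values() if len(g) > 1)


if __name__ == "__main__":
    for w in sorted(EJ):
        print(w, is_returnable(w, EJ, JE))
    print(paraphrase_groups(EJ))
```

With this toy data, "huge" and "quick" are not returnable because the J-E side lists only other English words for their translations, which is the kind of coverage mismatch the comparison is meant to surface.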
Similar resources
Automatic Construction of a Japanese-Chinese Dictionary via English
This paper proposes a method of constructing a dictionary for a pair of languages from bilingual dictionaries between each of the languages and a third language. Such a method would be useful for language pairs for which wide-coverage bilingual dictionaries are not available, but it suffers from spurious translations caused by the ambiguity of intermediary third-language words. To eliminate spu...
Extracting Bilingual Persian Italian Lexicon from Comparable Corpora Using Different Types of Seed Dictionaries
Ebrahim Ansari et al. 2017. Extracting bilingual Persian Italian lexicon from comparable corpora using different types of seed dictionaries. In "Applications of Comparable Corpora", edited book, Berlin Linguistic Press (ed.). Bilingual dictionaries are very important in various fields of natural language processing. In recent years, research on extracting new bilingual lex...
Model in Word
Extracting bilingual dictionaries from corpora can be seen as a very fine-grained alignment process, where the aligned units are not paragraphs or sentences but words and phrases. Most approaches to this problem rely on statistical means to build translation lexicons from bilingual texts, roughly falling into two categories: the hypotheses testing approach and the estimating approach. There are...
Bilingual Text Matching using Bilingual Dictionary and Statistics
This paper describes a unified framework for bilingual text matching by combining existing hand-written bilingual dictionaries and statistical techniques. The process of bilingual text matching consists of two major steps: sentence alignment and structural matching of bilingual sentences. Statistical techniques are applied to estimate word correspondences not included in bilingual dictionarie...
Utilizing Contextually Relevant Terms in Bilingual Lexicon Extraction
This paper demonstrates one efficient technique in extracting bilingual word pairs from non-parallel but comparable corpora. Instead of using the common approach of taking high frequency words to build up the initial bilingual lexicon, we show contextually relevant terms that co-occur with cognate pairs can be efficiently utilized to build a bilingual dictionary. The result shows that our model...